# Knowledge distillation

## OpenBuddy R1 0528 Distill Qwen3 32B Preview0 QAT GGUF
License: Apache-2.0
The quantized version of OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT. Quantization lets the model run more efficiently across a range of hardware.
Tags: Large Language Model · Supports Multiple Languages
Author: bartowski · Downloads: 720 · Likes: 1
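A minimal sketch of running a GGUF quantization like this one with llama-cpp-python. The file name below is an assumption; substitute whichever quant level you actually downloaded.

```python
from llama_cpp import Llama

# Hypothetical local file name for one of the published quantization levels.
llm = Llama(
    model_path="OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT-Q4_K_M.gguf",
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Briefly explain knowledge distillation."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```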
## Ultralong Thinking
An 8B-parameter language model merged with the SLERP method, combining the strengths of the DeepSeek-R1 and Nemotron-8B models.
Tags: Large Language Model · Transformers
Author: mergekit-community · Downloads: 69 · Likes: 2
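For illustration, the sketch below shows the spherical linear interpolation (SLERP) formula that merge tools such as mergekit apply to matching weight tensors of two models; it is a toy demonstration, not the exact merge recipe used for this checkpoint.

```python
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors at fraction t."""
    a, b = w_a.flatten().float(), w_b.flatten().float()
    a_n, b_n = a / (a.norm() + eps), b / (b.norm() + eps)
    omega = torch.acos(torch.clamp(torch.dot(a_n, b_n), -1.0, 1.0))  # angle between directions
    if omega.abs() < eps:
        merged = (1 - t) * a + t * b  # nearly parallel: fall back to linear interpolation
    else:
        so = torch.sin(omega)
        merged = (torch.sin((1 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return merged.reshape(w_a.shape).to(w_a.dtype)

# Toy usage on random tensors standing in for matching layers of two models.
w1, w2 = torch.randn(4, 4), torch.randn(4, 4)
print(slerp(w1, w2, t=0.5))
```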
## Distill Any Depth Large Hf
License: MIT
Distill-Any-Depth is a new state-of-the-art monocular depth estimation model trained with knowledge distillation.
Tags: 3D Vision · Transformers
Author: xingyang1 · Downloads: 2,322 · Likes: 2
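A hedged sketch of depth estimation with the transformers pipeline. The repository id `xingyang1/Distill-Any-Depth-Large-hf` is inferred from the listing above and may differ; the `-hf` suffix suggests a transformers-compatible export.

```python
from transformers import pipeline
from PIL import Image

# Repo id inferred from the listing; adjust if the actual id differs.
depth = pipeline("depth-estimation", model="xingyang1/Distill-Any-Depth-Large-hf")

image = Image.open("example.jpg")          # any local RGB image
result = depth(image)
result["depth"].save("depth_map.png")      # PIL image of the predicted depth map
print(result["predicted_depth"].shape)     # raw per-pixel depth tensor
```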
## Llama DNA 1.0 8B Instruct
A state-of-the-art bilingual language model based on the Llama architecture, specially optimized for Korean understanding and generation while retaining strong English capabilities.
Tags: Large Language Model · Transformers · Supports Multiple Languages
Author: dnotitia · Downloads: 661 · Likes: 58
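A minimal sketch, assuming the checkpoint id is `dnotitia/Llama-DNA-1.0-8B-Instruct` (inferred from the listing) and that it follows the standard chat-template interface.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dnotitia/Llama-DNA-1.0-8B-Instruct"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Korean prompt: "Hello! Please introduce yourself."
messages = [{"role": "user", "content": "안녕하세요! 자기소개를 해주세요."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```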
## Koala Lightning 1b
KOALA-Lightning-1B is a knowledge-distilled model based on SDXL-Lightning. It achieves efficient text-to-image generation by compressing the U-Net, with 1.16B parameters.
Tags: Text-to-Image
Author: etri-vilab · Downloads: 390 · Likes: 7
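A hedged sketch of text-to-image generation with diffusers, assuming the checkpoint `etri-vilab/koala-lightning-1b` loads through the standard SDXL pipeline (its U-Net is a compressed SDXL variant); the step count and guidance scale below are illustrative, not the authors' recommended settings.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Assumed repository id, taken from the listing above.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "etri-vilab/koala-lightning-1b", torch_dtype=torch.float16
).to("cuda")

# Lightning-style checkpoints target few denoising steps; 10 here is illustrative.
image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=10,
    guidance_scale=3.5,
).images[0]
image.save("koala_lightning.png")
```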
## Protgpt2 Distilled Tiny
License: Apache-2.0
A distilled version of ProtGPT2, compressed into a smaller, more efficient model through knowledge distillation; it maintains performance while improving inference speed.
Tags: Protein Model · Transformers
Author: littleworth · Downloads: 157 · Likes: 4
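A hedged sketch of protein sequence generation, assuming the checkpoint id is `littleworth/protgpt2-distilled-tiny` and that it follows the usual ProtGPT2 prompting pattern (generation seeded with the end-of-text token); the sampling parameters are illustrative.

```python
from transformers import pipeline

# Assumed repository id, inferred from the listing.
generator = pipeline("text-generation", model="littleworth/protgpt2-distilled-tiny")

# ProtGPT2-style models sample protein-like sequences from the <|endoftext|> prompt.
sequences = generator(
    "<|endoftext|>",
    max_length=100,
    do_sample=True,
    top_k=950,
    repetition_penalty=1.2,
    num_return_sequences=3,
)
for s in sequences:
    print(s["generated_text"])
```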
## Splade PP En V2
License: Apache-2.0
An implementation of the SPLADE++ model optimized for industrial scenarios, balancing retrieval quality and efficiency, with support for document expansion and sparse representation learning.
Tags: Text Embedding · Transformers · English
Author: prithivida · Downloads: 181 · Likes: 13
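A sketch of SPLADE-style sparse encoding: max-pooling log(1 + ReLU(logits)) over the sequence yields one weight per vocabulary term. The repository id `prithivida/Splade_PP_en_v2` is an assumption based on the listing.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "prithivida/Splade_PP_en_v2"   # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).eval()

inputs = tokenizer("knowledge distillation for sparse retrieval", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                       # (1, seq_len, vocab)

# SPLADE activation: per-term weight = max over tokens of log(1 + relu(logit)).
weights = torch.max(
    torch.log1p(torch.relu(logits)) * inputs["attention_mask"].unsqueeze(-1), dim=1
).values.squeeze(0)                                       # (vocab,)

top = torch.topk(weights, k=10)
for tok, val in zip(tokenizer.convert_ids_to_tokens(top.indices.tolist()), top.values.tolist()):
    print(f"{tok}\t{val:.2f}")
```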
## Mmlw Retrieval Roberta Large
License: Apache-2.0
MMLW ("I must get better news") is a neural text encoder for Polish, optimized for information retrieval tasks.
Tags: Text Embedding · Transformers · Other
Author: sdadas · Downloads: 237.90k · Likes: 12
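A hedged sketch using sentence-transformers. The repository id and the `zapytanie:` ("query:") prefix follow the usage commonly documented for the MMLW retrieval checkpoints and are assumptions here.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sdadas/mmlw-retrieval-roberta-large")  # assumed repo id

# Queries are prefixed with "zapytanie: " (assumed convention); passages are plain text.
query = "zapytanie: Jak działa destylacja wiedzy?"   # "How does knowledge distillation work?"
passages = [
    "Destylacja wiedzy przenosi wiedzę z dużego modelu nauczyciela do małego ucznia.",
    "Warszawa jest stolicą Polski.",
]

q_emb = model.encode(query, convert_to_tensor=True)
p_emb = model.encode(passages, convert_to_tensor=True)
print(util.cos_sim(q_emb, p_emb))   # the relevant passage should score highest
```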
## Mmlw Retrieval Roberta Base
License: Apache-2.0
MMLW ("I must get better news") is a Polish neural text encoder optimized for information retrieval tasks, converting queries and passages into 768-dimensional vectors.
Tags: Text Embedding · Transformers · Other
Author: sdadas · Downloads: 408 · Likes: 1
## Bk Sdm Small
License: OpenRAIL
BK-SDM is a Stable Diffusion model compressed through architectural compression for efficient, general-purpose text-to-image synthesis. It becomes lightweight by removing residual and attention blocks from the U-Net.
Tags: Image Generation
Author: nota-ai · Downloads: 745 · Likes: 31
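A sketch assuming the BK-SDM checkpoint keeps the standard Stable Diffusion interface and therefore loads with the regular diffusers pipeline; the repository id is taken from the listing.

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumed repository id, taken from the listing above.
pipe = StableDiffusionPipeline.from_pretrained(
    "nota-ai/bk-sdm-small", torch_dtype=torch.float16
).to("cuda")

image = pipe("a cozy cabin in a snowy forest, digital art").images[0]
image.save("bk_sdm_small.png")
```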
## LEALLA Large
License: Apache-2.0
LEALLA is a collection of lightweight, language-agnostic sentence embedding models supporting 109 languages, distilled from LaBSE. Suitable for multilingual sentence embeddings and bilingual text retrieval.
Tags: Text Embedding · Supports Multiple Languages
Author: setu4993 · Downloads: 37 · Likes: 8
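A sketch following the LaBSE-style usage pattern, where the pooled output serves as the sentence embedding; the repository id `setu4993/LEALLA-large` is taken from the listing and the pooling choice is an assumption.

```python
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("setu4993/LEALLA-large")  # assumed repo id
model = BertModel.from_pretrained("setu4993/LEALLA-large").eval()

sentences = ["dog", "Hund", "pies"]   # English, German, Polish
inputs = tokenizer(sentences, return_tensors="pt", padding=True)
with torch.no_grad():
    embeddings = model(**inputs).pooler_output   # (3, hidden_size) sentence vectors

# Cosine similarity between translations of the same word should be high.
norm = torch.nn.functional.normalize(embeddings, dim=-1)
print(norm @ norm.T)
```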
## LEALLA Small
License: Apache-2.0
LEALLA-small is a lightweight, language-agnostic sentence embedding model supporting 109 languages, suitable for multilingual sentence embedding and bilingual text retrieval tasks.
Tags: Text Embedding · Supports Multiple Languages
Author: setu4993 · Downloads: 41 · Likes: 14
## Distil Ita Legal Bert
A lightweight BERT model for the Italian legal domain, built with knowledge distillation and featuring only 4 Transformer layers.
Tags: Text Embedding · Transformers
Author: dlicari · Downloads: 353 · Likes: 0
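A hedged sketch of producing sentence embeddings with this encoder via masked mean pooling; the repository id comes from the listing and the pooling strategy is an assumption, not the model card's prescribed method.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "dlicari/distil-ita-legal-bert"   # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

texts = [
    "Il contratto è nullo per mancanza di causa.",
    "La sentenza è stata impugnata in appello.",
]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state             # (batch, seq, dim)

# Masked mean pooling over token embeddings.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(1) / mask.sum(1)
print(embeddings.shape)
```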
## Efficient Splade V Large Query
An efficient SPLADE model for passage retrieval. It uses a dual-model architecture that handles query and document inference separately and performs strongly on the MS MARCO dataset.
Tags: Text Embedding · Transformers · English
Author: naver · Downloads: 540 · Likes: 4
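A hedged sketch of the dual-model setup: one checkpoint encodes queries and another encodes documents, both producing SPLADE-style sparse vectors scored by dot product. The document-side id `naver/efficient-splade-V-large-doc` is an assumption based on the naming scheme of the query model in the listing.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

def splade_encode(model_id: str, text: str) -> torch.Tensor:
    """Encode text into a SPLADE sparse vector over the vocabulary."""
    tok = AutoTokenizer.from_pretrained(model_id)
    mdl = AutoModelForMaskedLM.from_pretrained(model_id).eval()
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        logits = mdl(**enc).logits
    return torch.max(
        torch.log1p(torch.relu(logits)) * enc["attention_mask"].unsqueeze(-1), dim=1
    ).values.squeeze(0)

q = splade_encode("naver/efficient-splade-V-large-query", "what is knowledge distillation")
d = splade_encode(
    "naver/efficient-splade-V-large-doc",   # assumed document-side checkpoint
    "Knowledge distillation transfers knowledge from a large teacher to a small student model.",
)
print(float(q @ d))   # sparse dot-product relevance score
```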
## Kominilm
KoMiniLM is a lightweight Korean language model designed to address the latency and capacity limitations of large language models in practical applications.
Tags: Large Language Model · Transformers
Author: BM-K · Downloads: 244 · Likes: 2
## Tinybert General 4L 312D De
A TinyBERT model optimized for German, created by distilling the BERT base cased model, suitable for natural language processing tasks.
Tags: Large Language Model · Transformers · German
Author: dvm1983 · Downloads: 269 · Likes: 3
## Frugalscore Medium Bert Base Mover Score
A lightweight text evaluation model based on knowledge distillation, using a small student model to mimic the scoring behavior of a large teacher model.
Tags: Large Language Model · Transformers
Author: moussaKam · Downloads: 43 · Likes: 0
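A hedged sketch of scoring a (reference, candidate) pair with a FrugalScore-style student model to approximate an expensive teacher metric (MoverScore in this checkpoint's name). The repository id is taken from the listing, and the regression-head, pair-input setup is an assumption about how the checkpoint was trained.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "moussaKam/frugalscore_medium_bert-base_mover-score"   # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

reference = "The cat sat on the mat."
candidate = "A cat was sitting on the mat."

# Assumed input format: the (reference, candidate) pair is scored with a single
# regression output that approximates the teacher metric's score.
inputs = tokenizer(reference, candidate, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(score)
```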
## Bert Base Uncased Squadv1.1 Sparse 80 1x4 Block Pruneofa
License: Apache-2.0
A BERT-Base model fine-tuned for question answering, using 80% 1x4 block-sparse pre-training combined with knowledge distillation.
Tags: Question Answering System · Transformers · English
Author: Intel · Downloads: 27 · Likes: 0
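A sketch assuming the pruned checkpoint keeps the standard extractive-QA interface, so the transformers question-answering pipeline can drive it directly; the repository id is inferred from the listing.

```python
from transformers import pipeline

# Repo id inferred from the listing; adjust if the actual id differs.
qa = pipeline(
    "question-answering",
    model="Intel/bert-base-uncased-squadv1.1-sparse-80-1x4-block-pruneofa",
)

result = qa(
    question="What does knowledge distillation transfer?",
    context="Knowledge distillation transfers knowledge from a large teacher "
            "model to a smaller student model.",
)
print(result["answer"], result["score"])
```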